@joecodecreations

Adds additional .jsonl data: the 52k-example dataset from the Alpaca project

@orangetin
Member

orangetin commented Mar 31, 2023

Hey, great PR! Unfortunately, the Alpaca dataset is under a Creative Commons Attribution-NonCommercial (CC BY-NC 4.0) license: https://github.com/tatsu-lab/stanford_alpaca/blob/aa65c492bb788e144712daab42bc5d11c2761591/DATA_LICENSE

This issue mentions them working on changing the license: tatsu-lab/stanford_alpaca#25 (comment)

As mentioned in this comment, datasets under an open source license are preferred.

It would be great if Alpaca changed the license, but as it stands right now, the data is limited to non-commercial use.

Maybe someone else can comment on this.
